\chapter{Statistical models}
\label{ch:statistical-models}

\ifpdf
	\graphicspath{{Chapter2/Chapter2Figs/PNG/}{Chapter2/Chapter2Figs/PDF/}{Chapter2/Chapter2Figs/}}
\else
	\graphicspath{{Chapter2/Chapter2Figs/EPS/}{Chapter2/Chapter2Figs/}}
\fi

The purpose of this chapter is to identify the relation between sets of parameters and average daily stock returns, and to determine which influences are significant across various sectors. The report is divided into three scenarios with different inputs: fundamental indicators, technical indicators, and a mix of the two. In each case, the model is examined with both linear and non-linear approaches, illustrated by stepwise regression, a tree-based method, and a neural network.

\section{Methodology}

\subsection{Stepwise linear regression}
Stepwise regression is a systematic method for adding and removing terms from a multilinear model based on their statistical significance in a regression. We use the forward stepwise procedure, which begins with an initial model and then compares the explanatory power of incrementally larger and smaller models. At each step, the p-value of an F-statistic is computed to compare the models with and without a candidate term. If a term is not currently in the model, the null hypothesis is that the term would have a zero coefficient if added to the model. If there is sufficient evidence to reject the null hypothesis, the term is added to the model. The method proceeds as follows:
	\begin{itemize}
	\item Fit the initial model.
	\item If any terms not in the model have p-values less than an entrance tolerance, add the one with the smallest p-value and repeat this step; otherwise, go to the next step.
	\item The method terminates when no single step improves the model.
	\end{itemize}
	
Depending on the terms included in the initial model and the order in which terms are moved in and out, the method may build different models from the same set of potential terms. There is no guarantee, however, that a different initial model or a different sequence of steps will not lead to a better fit. In this sense, stepwise models are locally optimal, but may not be globally optimal. The default entrance and exit tolerances in MATLAB are 0.05 and 0.10, respectively.
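
As a sketch of the test applied when a term is considered for entry (the notation here is ours and is introduced only for illustration), let $\mathrm{RSS}_0$ be the residual sum of squares of the current model with $p$ estimated coefficients, and let $\mathrm{RSS}_1$ be that of the model with the candidate term added, both fitted on the same $n$ observations. Assuming the candidate term contributes a single coefficient, the F-statistic is
\[
F = \frac{\mathrm{RSS}_0 - \mathrm{RSS}_1}{\mathrm{RSS}_1 / (n - p - 1)} .
\]
Under the null hypothesis and the usual assumptions of the Gaussian linear model, $F$ follows an F-distribution with $1$ and $n - p - 1$ degrees of freedom, so the p-value is $\Pr(F_{1,\,n-p-1} \geq F)$. The term enters the model when this p-value is below the entrance tolerance.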

\subsection{Tree-based method}
A tree is a classification method that partitions terms 


\section{Financial statistical model}
Our inputs are the quarterly financial ratios listed in Chapter 1. In the preparation step, we discard unavailable information and then apply quantile normalization. The reason for this standardization is that the inputs are measured in different units and outliers appear in some ranges of the data. Moreover, the inputs are also grouped by industry, because we expect comparable stocks to have similar and significant impacts on the dynamics of average daily returns.\\
% Author's note (translated from the draft): "daily" corresponds to the gap return.
The time variable in the model is adjusted depending on the user. In practice, the information available to ordinary investors is always delayed by one quarter. For instance, the financial statement of the first quarter is often announced publicly in the third quarter, after all accounting and auditing procedures are complete. Conversely, an insider, who can obtain the information more easily and faster than other market participants, may take advantage of it when making decisions. We therefore have to build different models for these two types of users.
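
One way to make the two types of users concrete, given here only as a sketch in our own notation, is to treat the reporting delay as a parameter. Assume that $r_{i,t}$ denotes the average daily return of stock $i$ associated with quarter $t$, that $\tilde{x}_{i,j,t}$ denotes the normalized value of financial ratio $j$ for that stock and quarter, and that $\ell$ is a hypothetical lag measured in quarters. In the linear case the model would then read
\[
r_{i,t} = \beta_0 + \sum_{j=1}^{k} \beta_j \, \tilde{x}_{i,j,t-\ell} + \varepsilon_{i,t} ,
\]
where $\ell = 0$ would correspond to an insider who already knows the figures of the current quarter, and $\ell \geq 1$ to an ordinary investor who only sees statements published with a delay. The non-linear approaches would replace the linear combination with a tree or a neural network applied to the same lagged inputs.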

\subsection{Pertinent variables}
Here are the fundamental indicators selected by each method.

\subsection{General interpretation of the obtained results}


\section{Technical statistical model}

\section{Hybrid model}


% ------------------------------------------------------------------------

%%% Local Variables: 
%%% mode: latex
%%% TeX-master: "../report"
%%% End: 
